Add InstallPrometheus and InstallHelm lib steps #1237

Merged: LeonardCareer merged 2 commits into v2 from leonarddu/lib-install-prometheus on Jun 30, 2026
LeonardCareer (Collaborator) commented Jun 25, 2026

Summary

Adds two new lib steps under kcl/lib/steps/k8s/:

  • install_helm.k (InstallHelm): installs Helm v3 on the pipeline agent.
  • install_prometheus.k (InstallPrometheus): installs the kube-prometheus-stack Helm chart (Operator + Prometheus + kube-state-metrics + Grafana + Alertmanager) into the current kubectl context.

import lib.steps.k8s.install_helm as helm
import lib.steps.k8s.install_prometheus as prom

helm.InstallHelm()
prom.InstallPrometheus(
    serviceConnection = SERVICE_CONNECTION,
    valuesFile        = "kcl/<scenario>/prometheus-values.yaml",
)

InstallHelm is its own step so other helm-based steps can reuse it.

Why

Multiple benchmark scenarios need to provision an in-cluster Prometheus on a freshly created AKS cluster — currently each scenario hand-rolls its own helm/kubectl-apply step. This step covers both cases discussed so far:

  • Xinwei's 15K-node bench (kube-prometheus-stack with cilium PodMonitor / PrometheusRule and KSM)
  • Leonard's S5 lease bench (lightweight: Operator + Prometheus only, all other components disabled in values.yaml)

What the caller must prepare

  1. Helm on PATH — call InstallHelm() in a prior step.
  2. Working kubeconfig — call azure.GetCredentials(...) in a prior step so kubectl works against the target cluster.
  3. values.yaml checked in to the caller's pipeline repo, e.g. kcl/<scenario>/prometheus-values.yaml. The path passed to valuesFile is repo-relative under $(Pipeline.Workspace)/s/. All workload-specific tuning (retention, storage, resource requests, scrape rules, PodMonitor/ServiceMonitor selectors, node selectors, tolerations, enabling/disabling Grafana/Alertmanager/KSM) lives in this file — see the chart's values.yaml for the full surface.
  4. Firewall egress to the chart registries (ghcr.io, quay.io, pkg-containers.githubusercontent.com, production.cloudflare.docker.com, and the *. wildcard subdomains of each): the caller handles this in their cluster-create step.
  5. Any PodMonitor / ServiceMonitor / PrometheusRule CRs the workload needs: the caller runs kubectl apply -f ... after this step. This is outside the lib's scope.
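
The first three prerequisites can be checked before the install runs. The following is a minimal sketch, not part of this PR: the function name `preflight` and its arguments are hypothetical, and it assumes only what the list above states, namely that helm and kubectl must be on PATH and that valuesFile resolves under `<workspace>/s/`.

```python
import shutil
from pathlib import Path


def preflight(workspace, values_file, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `workspace` stands in for $(Pipeline.Workspace); `which` is injectable
    so the check can be exercised without the real tools installed.
    """
    problems = []
    # Prerequisites 1 and 2 both need a CLI on PATH.
    for tool in ("helm", "kubectl"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    # Prerequisite 3: valuesFile is repo-relative under <workspace>/s/.
    resolved = Path(workspace) / "s" / values_file
    if not resolved.is_file():
        problems.append(f"values file not found: {resolved}")
    return problems
```

This only confirms that the tools and the file exist. It does not verify that the kubeconfig points at the intended cluster or that firewall egress is open.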

What InstallPrometheus does

  1. Adds / refreshes the prometheus-community helm repo.
  2. helm upgrade --install <releaseName> prometheus-community/kube-prometheus-stack with the caller's values file, --create-namespace, --wait, --atomic (rolls back cleanly on failure), and configurable timeout.
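
The two steps above correspond roughly to the Helm invocations sketched below. This is an illustration only: the helper `helm_commands` is hypothetical, and the exact script emitted by the lib step may differ (for example in how it refreshes the repo). The repo URL is the public prometheus-community chart repository, and every flag used is a standard Helm 3 flag.

```python
import shlex

REPO_NAME = "prometheus-community"
REPO_URL = "https://prometheus-community.github.io/helm-charts"
CHART = f"{REPO_NAME}/kube-prometheus-stack"


def helm_commands(values_path, release_name="prometheus", namespace="monitoring",
                  chart_version="", wait_timeout="10m"):
    """Build the argv lists for the repo refresh and the chart install."""
    # Step 1: add the repo (overwriting a stale entry) and refresh its index.
    repo_add = ["helm", "repo", "add", REPO_NAME, REPO_URL, "--force-update"]
    repo_update = ["helm", "repo", "update", REPO_NAME]
    # Step 2: idempotent install; --atomic rolls back if the release fails.
    install = [
        "helm", "upgrade", "--install", release_name, CHART,
        "--namespace", namespace, "--create-namespace",
        "--values", values_path,
        "--wait", "--atomic", "--timeout", wait_timeout,
    ]
    # An empty chart_version means "latest", so --version is omitted.
    if chart_version:
        install += ["--version", chart_version]
    return [repo_add, repo_update, install]


if __name__ == "__main__":
    # Placeholder values path, for illustration only.
    for argv in helm_commands("kcl/demo/prometheus-values.yaml"):
        print(shlex.join(argv))
```

Returning argv lists rather than a single shell string avoids quoting problems when a path or release name contains spaces.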

Parameters

| Parameter | Default | Required |
|---|---|---|
| serviceConnection | (none) | yes |
| valuesFile | (none) | yes |
| namespace | "monitoring" | no |
| releaseName | "prometheus" | no |
| chartVersion | "" (latest) | no |
| waitTimeout | "10m" | no |
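
As a reading aid, the parameter defaults can be modelled as follows. This is a hypothetical Python mirror of the parameter list, not the KCL definition from this PR. The validation rules are assumptions: the two required fields must be non-empty, and waitTimeout is checked against a simplified Go-style duration pattern because the value is passed to Helm's `--timeout`.

```python
import re
from dataclasses import dataclass

# Simplified Go-style duration such as "10m", "90s", or "1m30s" (assumption).
_DURATION = re.compile(r"(\d+(\.\d+)?(ns|us|ms|s|m|h))+")


@dataclass(frozen=True)
class InstallPrometheusParams:
    serviceConnection: str
    valuesFile: str
    namespace: str = "monitoring"
    releaseName: str = "prometheus"
    chartVersion: str = ""  # empty string means the latest chart version
    waitTimeout: str = "10m"

    def __post_init__(self):
        # Required parameters must be present and non-empty.
        for name in ("serviceConnection", "valuesFile"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required")
        if not _DURATION.fullmatch(self.waitTimeout):
            raise ValueError(f"waitTimeout is not a duration: {self.waitTimeout!r}")
```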

Validation

Tested end to end in the Telescope S5 lease benchmark pipeline (internal repo) against an AKS H8 hyperscale cluster in southeastasia:

  • helm install completes in ~2.5 min
  • the installed svc/prometheus-kube-prometheus-prometheus is port-forwardable and serves /-/ready, /api/v1/query, /api/v1/query_range
  • the caller's additionalScrapeConfigs entry picks up the workload pods correctly
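
The endpoint checks above can be repeated with a short script once the service is port-forwarded. The sketch below assumes a forward such as `kubectl port-forward -n monitoring svc/prometheus-kube-prometheus-prometheus 9090:9090` is already running. The local port and the helper names are assumptions, while `/-/ready` and `/api/v1/query` are the standard Prometheus HTTP endpoints named in the list.

```python
import urllib.parse
import urllib.request


def ready_url(base_url):
    """URL of the Prometheus readiness endpoint."""
    return base_url.rstrip("/") + "/-/ready"


def query_url(base_url, promql):
    """URL of an instant query, with the PromQL expression URL-encoded."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def is_ready(base_url, timeout=5.0):
    """Return True if /-/ready answers 200; False on any HTTP or socket error."""
    try:
        with urllib.request.urlopen(ready_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are both subclasses of OSError
        return False


if __name__ == "__main__":
    base = "http://localhost:9090"  # assumed local end of the port-forward
    print("ready:", is_ready(base))
    print("query:", query_url(base, "up"))
```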

xinWeiWei24 (Collaborator) left a comment

LGTM

3 comment threads on kcl/lib/steps/k8s/install_prometheus.k (all outdated)
LeonardCareer changed the title from "Add InstallKubePrometheusStack lib step" to "Add InstallPrometheus and InstallHelm lib steps" on Jun 30, 2026
LeonardCareer merged commit 550d15f into v2 on Jun 30, 2026
1 check passed
LeonardCareer deleted the leonarddu/lib-install-prometheus branch on June 30, 2026 05:28

3 participants